ABSTRACT

Semantic properties are domain-specific specification con- 
structs used to augment an existing language with richer se- 
mantics. These properties are taken advantage of in system 
analysis, design, implementation, testing, and maintenance 
through the use of documentation and source-code transfor- 
mation tools. Semantic properties are themselves specified 
at two levels: loosely with precise natural language, and formally
within the problem domain. The refinement relationships
between these specification levels, as well as between a
semantic property's use and its realization in program code
via tools, are specified with a new formal method for reuse
called kind theory.



Categories and Subject Descriptors 

D.1.0 [Software]: Programming Techniques — General; D.2 
[Software]: Software Engineering; D.3.1 [Software]: Pro- 
gramming Languages — Formal Definitions and Theory; D.3.2 
[Software]: Programming Languages — Language Classifi- 
cations [design languages]; D.3.4 [Software]: Programming 
Languages — Processors [preprocessors]; F.3.1 [Theory of 
Computation]: Logics and Meanings of Programs — Spec- 
ifying and Verifying and Reasoning about Programs; F.4.1 
[Theory of Computation] : Mathematical Logic and Formal 
Languages — Mathematical Logic; F.4.3 [Theory of Com- 
putation]: Mathematical Logic and Formal Languages — 
Formal Languages 



General Terms 

documentation, semantic properties, specification languages, 
formal methods, kind theory, specification reuse, documen- 
tation reuse 



1. INTRODUCTION 

Ad hoc constructs and local conventions have been used to 
annotate program code since the invention of programming 
languages. The purpose of these annotations is to convey ex- 
tra programmer knowledge to other system developers and 
future maintainers. These comments usually fall into that 
gray region between completely unstructured natural lan- 
guage and formal specification. 

Invariably, such program comments rapidly exhibit "bit rot". 
Over time, these comments, unless well maintained by doc- 
umentation specialists, rigorous process, or other extra-mile 
development efforts, become out-of-date. They are the focus 
for the common mantra: an incorrect comment is worse than 
no comment at all. 

Recently, with the adoption and popularization of lightweight
documentation tools in the literate programming tradition [17, 18],
an ecology of semi-structured comments is flourishing.
The rapid adoption and popularity of Java primed interest
in semi-structured comment use via the Javadoc tool. Other
similar code-to-documentation transformation tools have since
followed in volume, including Jakarta's Alexandria, Doxygen,
and Apple's HeaderDoc. SourceForge reports thirty-six
projects with "Javadoc" in the project summary. Freshmeat
reports another thirty-five, with some overlap.

While most of these systems are significantly simpler
than Knuth's original CWEB, they share two key features.

First, they are easy to learn, since they necessitate only a 
small change in convention and process. Rather than forcing 
the programmer to learn a new language, complex tool, or 
imposing some other significant barrier to use, these tools 
actually reward the programmer for documenting her code. 

Second, a culture of documentation is engendered. Prompted 
by the example of vendors like Sun, programmers enjoy the 
creation and use of the attractive automatically-generated 
documentation in a web page format. This documentation- 
centric style is only strengthened by the exhibitionist nature 
of the Web. Having the most complete documentation is
now a point of pride in some Open Source projects; a state 
of affairs we would not have guessed at a decade ago. 

The primary problem with these systems, and the documen- 
tation and code written using them, is that even semi-structured 
comments have no semantics. Programmers are attempting 
to state (sometimes quite complex) knowledge but are not 
given the language and tools with which to communicate 
this knowledge. And since the vast majority of developers 
are unwilling to learn a new, especially formal, language 
with which to convey such information, we must look for 
a happy medium of informal formality.

That compromise, the delicate balance between informality 
and formality, and the lightest-weight aspect of our Knowl- 
edgeable Software Engineering program, is what we call se- 
mantic properties. 

Semantic properties are domain-independent documentation 
constructs with intuitive formal semantics that are mapped 
into the semantic domain of their application. Semantic prop- 
erties are used as if they were normal semi-structured docu- 
mentation. But, rather than being ignored by compilers and 
development environments as comments typically are, they 
have the attention of augmented versions of such tools. Se- 
mantic properties embed a tremendous amount of concise 
information wherever they are used without imposing the of- 
ten insurmountable overhead seen in the introduction of new 
languages and formalisms for similar purposes. 



2. SEMANTIC PROPERTIES 

The original inspiration for semantic properties came from 
three sources: the use of tags (e.g., @author and @param)
in the Javadoc system, the use of annotations and pragmas 
in languages like Java and C for code transformation and 
guided compilation, and indexing clauses in Eiffel. All of 
these systems have a built-in property-value mechanism, one 
at the documentation level and one within the language syn- 
tax itself, that is used to specify semi-structured information. 

In Java, tags are the basic realization of our semantic properties.
They are used for documentation and formal specification,
as we will discuss in more detail in Section 3.1.1. Tags
are not part of the language specification. In fact, they are 
entirely ignored by all Java compilers. 

Annotations and pragmas come in the form of formal tags
used for some Design by Contract tools like Jass, which
happen to be realized in Eiffel with first-class keywords like
require, ensure, and invariant.

Eiffel provides first-class support for properties via indexing 
clauses. An Eiffel file can contain arbitrary property-value 
pairs inside of indexing blocks. This information is used 
by a variety of tools for source code search, organization, 
and documentation. 



3 http://semantik.informatik.uni-oldenburg.de/jass/



2.1 Documentation Semantics 

Recently, Sun has started to extend the semantics of these
basic properties with respect to language semantics, particularly
with regard to inheritance. If a class C inherits from a
parent class P, and P's method m has some documentation,
but C's overridden or effective (in the case where P and/or
m is abstract) version of m does not, then Javadoc inherits
P.m's documentation for C.m, generating the appropriate
comments in Javadoc's output.
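
The inheritance behavior described above can be illustrated with a minimal sketch. The classes P and C mirror the names used in the text; the method body and its doubling behavior are assumptions made purely for illustration. C.m carries no documentation comment of its own, so a documentation generator such as Javadoc falls back on the comment written for P.m.

```java
/** Parent class whose abstract method carries the only documentation. */
abstract class P {
    /**
     * Returns twice the given value.
     *
     * @param x the value to double.
     * @return the doubled value.
     */
    public abstract int m(int x);
}

/** Child class whose effective version of m has no doc comment. */
class C extends P {
    // No doc comment here: Javadoc reuses the comment attached to P.m.
    @Override
    public int m(int x) {
        return 2 * x;
    }
}
```

Nothing in the source text of C states that the comment is inherited; the meaning is supplied entirely by the tool, which is exactly the implicit semantic change the text points out.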

This change in behavior of the tools is an implicit change in 
the semantics of the documentation. While straightforward 
and useful in this example, the meaning of such inheritance 
is undocumented and often unclear. 

The situation in Eiffel is less confusing. The semantics of 
properties, as realized by indexing clauses and formal pro- 
gram annotation via contracts, are defined in the language 
standard [22, 23].

Even so, no mechanism exists in either system for extending 
these semi-structured comments with new semantics beyond 
simple plug-ins for documentation (e.g., doclets in Java and 
translators in EiffelStudio). 

Also, the semantics of current annotations are entirely spec- 
ified within a particular language or formalism. No gen- 
eral purpose formalism has been used to express their extra- 
model semantics. 

2.2 Semantics of Semantic Properties 

We specify the semantics of semantic properties in a new 
formalism called kind theory. Kind theory is a logic used to 
describe, reason about, and discover reusable assets of arbitrary
sorts. Kind theory is a higher-order, autoepistemic,
paraconsistent, categorical logic with a type theoretic and
algebraic model, and is described in full detail in Kiniry's
dissertation [15].

2.2.1 A Brief Overview of Kind Theory
Kinds are classifiers used to describe reusable assets like program
code, components, documentation, specifications, etc.
Instances are realizations of kind: actual embodiments of
these classifiers. For example, the paperback "A Portrait
of the Artist as a Young Man" by James Joyce is an instance
of kinds PaperbackBook, EnglishDocument,
and others.

In the context of semantic properties, our kinds are the se- 
mantic properties as well as the programming language con- 
structs to which the properties are bound. Our instances are 
the specific realizations of these kinds within a particular in- 
put structure, typically a programming or specification lan- 
guage. 

Kinds are described structurally using our logic in a number
of ways. Classification is covered with the inheritance
operators $<$ and $<_p$; structural relationships are formalized
using the inclusion operators $\subset_p$ and $\supset$; equivalence
has several forms; realization, the relationship between instances
and kind, is formalized with the operators $<_r$ and $:$;
composition is captured in several forms, $\otimes$, $\oplus$, and
$\circ$; and interpretation, the translation of kind to kind or
instances to instances, is realized with two interpretation operators.

4 Auto-epistemic: "representing or discussing self-knowledge".
5 Paraconsistent: "explicitly representing and reasoning with
potential and transitory logic inconsistency".

Semantics are specified in an autoepistemic fashion using 
what are called truth structures. Truth structures come in 
two forms: claims and beliefs. 

Claims are stronger than beliefs. A mathematically proven 
statement that is widely accepted is a claim. This phrasing 
is used because, for example, there are theorems that have a 
preliminary proof but are not yet widely recognized as being 
true. 

A statement that is universally accepted, but not necessarily 
mathematically proven, is also a claim. Claims are not nec- 
essarily mathematical formulas. The statement "the sun will 
rise tomorrow" is considered by the vast majority of listeners 
a true and valid statement, thus is classified as a claim rather 
than as a belief. 

Beliefs, on the other hand, range in surety from completely
unsure to absolutely convinced. No specific metric is defined
for the degree of conviction; the only requirement placed on
the associated belief logic is that the belief degrees form a
partial order.

We use kind theory to specify semantic properties because 
it provides us with an excellent model-independent (i.e., it 
is not bound to some specific programming language) reuse- 
centric formalism. Kind theory's whole purpose is the spec- 
ification of such reusable concepts. 

We have insufficient space to summarize kind theory here, 
so we will simply provide some basic definitions, examples, 
and motivation of its use within the context of semantic prop- 
erties. 



2.2.2 Kind Theory and Semantic Properties 
Using kind theory, we specify the semantic relationships between
specifications and their realization. With regard to
semantic properties, these relationships formally explain which 
properties exist, how they can be structured, how they can be 
applied in a specific language or system, and how they are in- 
terpreted into alternative forms like documentation and test 
code. 

First, we denote the classification relationships between kinds
using inheritance operators. Properties are classified using
standard conceptual data modeling [9, 35] and ontology engineering
[1, 8, 30] techniques into a kind hierarchy. Details
about these classifications for semantic properties are discussed
in Section 2.3.

Next, the structural relationships between kinds are specified
using the inclusion operators. Structural relationships denote
the contexts in which a kind can be used and how kinds are
composed to create new kinds. We discuss structural relations
in more detail in Section 2.4.

Finally, equivalence relations and interpretations are defined 
on kinds. Equivalence relations help refine concepts embod- 
ied as kinds for particular domain models by folding and 
simplifying kind hierarchies. They also let the user pick rep- 
resentative structures, called canonical forms, that are used 
to represent and reason about (semi-)equivalent kind. 

Interpretations, which are structure-preserving functions, are 
defined to help capture notions of inheritance, equivalence, 
and other inter-domain translations. 

The formal aspects of kind theory, such as the specification of
kind domains for things like semantic properties, are handled by
an expert. The typical software engineer never needs to learn
or witness the formalism to benefit from its availability.

2.2.2.1 A Kind Example

Consider a loop in any standard programming language. Loops 
come in many syntactic forms. For example, in the C pro- 
gramming language there are three primary loop constructs: 
for, while, and do. Fundamentally, all loop constructs 
are equivalent to each other at some abstraction level; they 
are each just syntactic variations on a common theme. That 
theme is specified by the kind Loop. 

We classify loops as a computational structure. This classi- 
fication states that the notion of a loop is going to be bound 
to a specific syntax and can be interpreted in some program- 
matic context. We state this relation as 

Loop < ComputationalStructure 

By the rules of inheritance, all the structure inherent in the
parent kind, that of ComputationalStructure, is realized
in the child kind Loop as well.

Additionally, an interpretation exists from Loop to
ComputationalStructure that takes kinds or instances of
loops to kinds or instances
of computational structures, respectively. This function is 
called a partial interpretation because it eliminates all of 
the semantics of loops that differentiates them from general 
computational structures. Interpretations are realized by cat- 
egorical forgetful functors in kind theory. 

The structure of loops is straightforward. Each loop has an 
initial state, an increment function, a guard, and a body. We 
state this kind theoretically as 

InitialState $\subset_p$ Loop
IncrementFunction $\subset_p$ Loop
GuardPredicate $\subset_p$ Loop
LoopBody $\subset_p$ Loop

Each of these kind, in turn, has its own structure associated 
with it. GuardPredicate < Predicate, for example. 
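
To make the classification and inclusion relations of the Loop example concrete, the following is a minimal sketch in Java. The classes Kind and LoopKinds and the methods isA, hasPart, and withPart are hypothetical names introduced only for this illustration; they are a drastically simplified stand-in and not the kind system described in the dissertation. A single parent link models inheritance and a list of parts models inclusion.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for a kind (an assumption, not the real kind system). */
final class Kind {
    final String name;
    final Kind parent;                           // inheritance link; null for a root kind
    final List<Kind> parts = new ArrayList<>();  // inclusion links (kinds that are part of this kind)

    Kind(String name, Kind parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Declares that the given kind is a part of this kind and returns this kind. */
    Kind withPart(Kind part) {
        parts.add(part);
        return this;
    }

    /** True if this kind is the other kind or transitively inherits from it. */
    boolean isA(Kind other) {
        for (Kind k = this; k != null; k = k.parent) {
            if (k == other) {
                return true;
            }
        }
        return false;
    }

    /** True if the given kind was declared as a direct part of this kind. */
    boolean hasPart(Kind part) {
        return parts.contains(part);
    }
}

/** The Loop example from the text, encoded with the hypothetical Kind class. */
final class LoopKinds {
    static final Kind COMPUTATIONAL_STRUCTURE = new Kind("ComputationalStructure", null);
    static final Kind PREDICATE = new Kind("Predicate", null);
    static final Kind INITIAL_STATE = new Kind("InitialState", null);
    static final Kind INCREMENT_FUNCTION = new Kind("IncrementFunction", null);
    static final Kind GUARD_PREDICATE = new Kind("GuardPredicate", PREDICATE);
    static final Kind LOOP_BODY = new Kind("LoopBody", null);
    static final Kind LOOP = new Kind("Loop", COMPUTATIONAL_STRUCTURE)
            .withPart(INITIAL_STATE)
            .withPart(INCREMENT_FUNCTION)
            .withPart(GUARD_PREDICATE)
            .withPart(LOOP_BODY);
}
```

With these definitions, LoopKinds.LOOP.isA(LoopKinds.COMPUTATIONAL_STRUCTURE) holds while the converse does not, which matches the direction of the inheritance relation stated in the text.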



Interpretations let us do two primary things. First, we use 
interpretations to translate among different forms of loops, 
converting a while loop into a do loop, for example. Sec- 
ond, they are used for interpreting the generic semantics of 
loops in a specific language or formal context. 

For instance, a generic specification of a loop instance can be 
translated to and from a specific syntactic structure realized 
in the Java programming language. Additionally, a formally 
specified loop (the kind FormalLoop), complete with a
loop variant function and invariant predicate, can be trans- 
lated into a proof structure in our logical framework. This 
opens up the opportunity for statically proving the correct- 
ness of such formally specified loops. 
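
As a rough illustration of what a loop "complete with a loop variant function and invariant predicate" looks like, the sketch below annotates an ordinary Java loop and checks both at run time. This is only a dynamic approximation under our own assumptions: the class FormalLoopSketch, the method sumBelow, and the chosen variant and invariant are hypothetical, and the static proof described in the text is not attempted here.

```java
/** Hypothetical illustration: a loop whose variant and invariant are checked at run time. */
final class FormalLoopSketch {
    /** Sums the integers 0..n-1 while checking the loop's annotations on every iteration. */
    static long sumBelow(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        int i = 0;       // initial state
        long sum = 0;
        while (i < n) {  // guard
            int variantBefore = n - i;  // variant: n - i
            sum += i;                   // body
            i++;                        // increment
            // Invariant: sum equals 0 + 1 + ... + (i - 1), that is, i * (i - 1) / 2.
            if (sum != (long) i * (i - 1) / 2) {
                throw new IllegalStateException("invariant violated");
            }
            // The variant must stay non-negative and strictly decrease.
            int variantAfter = n - i;
            if (variantAfter < 0 || variantAfter >= variantBefore) {
                throw new IllegalStateException("variant did not decrease");
            }
        }
        return sum;
    }
}
```

The four pieces of loop structure named earlier (initial state, increment function, guard, and body) are marked in the comments, so the same instance can be read either as plain Java or as a formally annotated loop.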

All of the above details are fully formalized in a kind domain 
in the basic kind system [15].

A kind domain is simply a set of kind and instances that 
are specific to some domain of knowledge. In this case, that 
domain is one of language-generic computational structures, 
and we call that domain ComputationalStructures. 

A kind system is an actual computational system that realizes
kind theory. Our first realization is witnessed both in
a logical framework, that of SRI's Maude [4], as well as in
software engineering tools like the Jiki (used as an open,
collaborative knowledge repository), the JPP [16], and the
EBON tool suite (a design model checker). Some of these
tools are discussed in more detail in Sections 2.7 and 3.

2.3 Properties and Their Classification 

We have defined elsewhere thirty-five semantic properties.
All semantic properties are enumerated in the table in the
appendix. Since we have only a limited amount of space in
this paper, we will summarize some of the more interesting 
properties, their semantics, and our experiences with their 
use in a number of software engineering projects, large and 
small, over the last five years. 

To derive our core set of semantic properties, we abstracted 
and unified the existing realizations that we have used in two 
languages for many years. First, we gathered the set of pre- 
defined Javadoc tags, the standard Eiffel indexing clauses, 
and the set of basic formal specification constructs. After 
that set was made self-consistent, that is, duplicates were 
removed, semantics were weakened across domains for the 
generalization, etc., we declared it the core set of semantic 
property kind. 

These properties were then classified according to their general
use and intent. The classifications are: meta-information,
pending work, contracts, concurrency, usage information,
versioning, inheritance, documentation, dependencies, and
miscellaneous. This classification is represented using kind
theory's inheritance operators.

6 http://www.jiki.org/
7 http://ebon.sf.net/

8 The original specification of these properties
was in the Infospheres Java Coding Standard
(http://www.infospheres.caltech.edu/). That standard has since
been refined and broadened. More recent versions are available at
KindSoftware (http://www.kindsoftware.com/).

Many of these semantic properties are used solely for docu- 
mentation purposes. The title property documents the ti- 
tle of the project with which a file is associated; the description 
property provides a brief summary of the contents of a file. 
We call these informal semantic properties. 

Another set of properties is used for specifying non-programmatic
semantics. By "non-programmatic" we mean that the
properties have semantics, but they are not, or cannot be,
expressed in program code. For example, labeling a construct
with a copyright or license property specifies
some legal semantics. Tagging a method with a bug property
specifies that the method has some erroneous behavior
that is described in detail in an associated bug report. We call
these properties semi-formal because they have a semantics,
but one outside the domain of software.

Finally, the balance of the properties specify structure that 
is programmatically testable, checkable, or verifiable. Ba- 
sic examples of such properties are require and ensure 
tags for preconditions and postconditions, modifies tags 
for side-effect semantics, and the concurrency and generate 
tags for concurrency semantics. These properties are called 
formal because they can be realized by a formal semantics. 

The KindSoftware coding standard [?] summarizes our cur- 
rent set of semantic properties. Each property has a syntax, a 
correct usage domain, and a natural language summary. The 
formalization of semantic properties is found in Kiniry's
dissertation [15].

2.4 Context 

Each property has a legal scope of use, called its context. 
Contexts are defined in a coarse, language-independent fashion
using inclusion operators in kind theory. Contexts
comprise files, modules, features, and variables.

Files are exactly that: data files in which program code re- 
sides. The scope of a file encompasses everything contained 
in that file. 

A module is some large-scale program unit. Modules are 
typically realized by an explicit module- or class-like struc- 
ture. Examples of modules are classes in object-oriented 
systems, modules in languages of the Modula and ML fam- 
ilies, packages in the Ada lineage, etc. Other words and 
structures typically bound to modules include units, proto- 
cols, interfaces, etc. 

Features are the entry point for computation. Features are 
often named, have parameters, and return values. Functions 
and procedures in structured languages are features, as are 
methods in object-oriented languages, and functions in func- 
tional systems. 

Finally, variables are program variables, attributes, constants,
enumerations, etc. Because few languages enforce any access
principles for variables, variables can vary in semantics
considerably.

Each property listed in the appendix has a legal context. 
The context All means that the property can be used at the 
file, module, feature, or variable level. Additional contexts 
can be defined, extending the semantics of contexts for new 
programming language constructs that need structured doc- 
umentation with properties. 
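
The notion of a legal context can be sketched as a small data structure. In the following hedged example, the enum Context lists the four contexts named in the text, while the class PropertyContext and its method isLegalIn are hypothetical names of our own. The actual context of each property is given in the appendix, so any specific assignment used with this sketch is an assumption.

```java
import java.util.EnumSet;
import java.util.Set;

/** The four coarse, language-independent contexts named in the text. */
enum Context { FILE, MODULE, FEATURE, VARIABLE }

/** Hypothetical pairing of a property name with the contexts in which it may be used. */
final class PropertyContext {
    /** The "All" context: legal at the file, module, feature, or variable level. */
    static final Set<Context> ALL = EnumSet.allOf(Context.class);

    final String property;
    final Set<Context> legal;

    PropertyContext(String property, Set<Context> legal) {
        this.property = property;
        this.legal = legal;
    }

    /** True if the property may legally be attached to a construct in the given context. */
    boolean isLegalIn(Context context) {
        return legal.contains(context);
    }
}
```

For example, a precondition property could be declared with EnumSet.of(Context.FEATURE), so that a checker rejects it when it appears at the file level. Supporting a new language construct would amount to adding a constant to Context.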

2.5 Visibility 

In languages that have a notion of visibility, a property's vis- 
ibility is equivalent to the visibility of the context in which 
it is used, augmented by domain-specific visibility options 
expressed in kind theory. 

Typical basic notions of visibility include public, private, 
children (for systems with inheritance), and module (e.g., 
Java's package visibility). More complex notions of visi- 
bility are exhibited by C++'s notion of friend and Eiffel's 
class-based feature scoping. 

Explicit visibilities for semantic properties are used to re- 
fine the notion of specification visibility for organizational, 
social, and formal reasons. 

For example, a subgroup of a large development team might 
choose to expose some documentation for, and specification 
of, their work only to specific other groups for testing, polit- 
ical, or legal reasons. 

On the social front, new members of a team might not have 
yet learned specific tools or formalisms used in semantic 
properties, so using visibility to hide those properties will 
help avoid information overload. 

Lastly, some formal specification, especially when viewed 
in conjunction with standard test strategies (e.g., whitebox, 
greybox, blackbox, unit testing, scenario-based testing), has 
distinct levels of visibility. For example, testing the postcon- 
dition of a private feature is only reasonable and permissible 
if the testing agent is responsible for that private feature. 

2.6 Inheritance 

Semantic properties also have a well-defined notion of prop- 
erty inheritance. Once again, we do not want to force new 
and complicated extra-language semantics on the software 
engineer. Therefore, property inheritance semantics match 
those of the source language in which the properties are 
used. Our earlier discussion of basic comments for Java 
methods (a feature property context) is an example of such 
property inheritance. 

These kinds of inheritance come in two basic forms: re- 
placement and augmentation. 

The replacement form of inheritance means that the parent
property is completely replaced by the child property. An
example of such semantics is feature overriding in Java and
the associated documentation semantics thereof.



Augmentation, on the other hand, means that the child's properties
are actually a composition of all its parents' properties.
These kinds of composition come in several forms. The most
familiar are the standard substitution principle-based type
semantics [21] in many object-oriented systems, and the Hoare
logic/Dijkstra calculus-based semantics of contract refinement
[24].

We can express these formal notions using kind theory be- 
cause it is embedded in a complete logical framework. For 
example, we can automatically reason about the legitimacy 
of specification refinement much like Findler and Felleisen
discuss in [5].
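
The augmentation form of inheritance for contracts can be sketched with ordinary predicates. The example below assumes the Eiffel-style refinement rule, in which an overriding feature may only weaken the inherited precondition (parent OR child) and only strengthen the inherited postcondition (parent AND child). The class ContractRefinement and its method names are hypothetical and are not taken from any of the tools discussed here.

```java
import java.util.function.Predicate;

/** Hypothetical helper that composes inherited and local contracts, Eiffel-style. */
final class ContractRefinement {
    /** Effective precondition: the child may only weaken it, so either one suffices. */
    static <T> Predicate<T> effectivePrecondition(Predicate<T> parent, Predicate<T> child) {
        return parent.or(child);
    }

    /** Effective postcondition: the child may only strengthen it, so both must hold. */
    static <T> Predicate<T> effectivePostcondition(Predicate<T> parent, Predicate<T> child) {
        return parent.and(child);
    }
}
```

Under this rule, a parent precondition of x > 0 combined with a child precondition of x == 0 yields an effective precondition that accepts every non-negative x, so any caller that satisfied the parent's contract still satisfies the child's.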

2.7 Tool Support 

We have used these semantic properties for the last five years. 
We have found that, while an explicitly adopted coding standard,
positive feedback via tools and peers, and course grade
and monetary rewards go a long way toward raising the
bar for documentation and specification quality, these social
aspects are simply not enough. Process does help, regular
code reviews and pair programming in particular, but tool 
support is critical to maintaining quality specification cover- 
age, completeness, and consistency. 

Templates were the first step taken. We have used raw docu- 
mentation and code templates in programming environments 
ranging from vi to emacs to jEdit. But templates only help 
prime the process, they do not help maintain the content. 

Code and comment completion also helps. Completion is the 
ability of an environment to use partial input to derive a typ- 
ically more lengthy full input. We have experimented with 
augmented versions of completion in emacs, for example. 

Likewise, documentation lint checkers, particularly those embedded
in development environments and documentation generators,
are also useful. We view source text highlighting, as
in font-lock mode in emacs, as an extremely weak form of 
lint-checking. The error reports issued by Javadoc and its 
siblings are a stronger form of lint-checking and are quite 
useful for documentation coverage analysis, especially when 
a part of the regular build process. Finally, scripts integrated 
into a revision control system provide a "quality firewall" to 
a source code repository in much the same fashion. 

We believe that more can and should be done. Our approach 
is to build and use what we call Knowledgeable Develop- 
ment Environments (KDEs). These development environ- 
ments use knowledge representation and formal reasoning 
behind the scenes to help the user work smarter and not 
harder. 

We have started work on such an environment. By extending
powerful emacs modes and tools that are part of our ini- 
tial development environment (e.g., XEmacs coupled with 
the object-oriented browser, hyperbole, JDE, semantic, and 
speedbar) with a kind system, we hope to raise the bar on 
development environments. 



2.7.1 Current Work on KDEs

The first two features that we plan to implement are docu- 
mentation inheritance and perspective. 

Eiffel development environments contain tools that provide 
what are called the flat, short, and contract views of a class. 
Flat forms show the flattened version of a class — all inher- 
ited features are flattened into a single viewpoint. The short 
form eliminates the implementation of all methods so that 
the developer can focus on a class's interface. The contract 
form is like the short form except the contracts of the class 
(feature preconditions and postconditions, and class invariants)
are shown. These forms can be combined, thus flat
short or flat contract forms have the obvious meanings.

Knowledgeable documentation inheritance is an extended 
version of such views. Rather than manually program the 
semantics of the "flattening" operation, our formal specifica- 
tion in kind theory automatically interprets the appropriate 
instances into a new form for rendering within the knowl- 
edgeable development environment. And because such in- 
terpretations are often fully reversible, the flattened forms 
can be edited and the changes will properly "percolate" to 
their original source locations. 

Perspectives enable the user to specify which role(s) she is in 
while interacting with the kind-enabled system. Since kind 
theory is autoepistemic, the specification of a role (repre- 
sented by an agent within the theory) permits automatic "fil- 
tering" of information according to, for example, visibility 
rules as discussed in Section 2.5. This user-centric filtering
of information, much like narrowing modes within Emacs, 
helps the user focus on the problem at hand, ignoring all in- 
formation that she either is not interested in, concerned with, 
or should not see. 

These are only two of our ideas for how to expose the user- 
centric aspects of kind theory via development environments, 
incorporating the use of semantic properties throughout. 

3. EMBEDDING SEMANTIC PROPERTIES 

When a semantic property is bound to a particular instance, 
for example, an @author tag is used in some Java source
code, what does this formally mean beyond questions of 
structural conformance? How do these semantic properties 
help guide the development process and exercise the system 
during testing? How do new tools take advantage of these 
properties? 

First, we have to embed the semantic properties into the lan- 
guage in which we are working. Second, we need to de- 
fine domain-specific semantics using kind interpretations. 
Lastly, we use kind theory's belief truth structures to guide 
program development. 

We will first look at syntactic embedding for two program- 
ming and one specification language. In the latter parts of 
this next section we will address the other two points. 



3.1 Programming Languages

We have used semantic extensions in two programming languages:
Java and Eiffel.

3.1.1 Java 

Semantic properties are embedded in Java code using Javadoc- 
style comments. This makes for a simple, parseable syn- 
tax, and the kind composition of semantic properties to con- 
structs is simply realized by textual concatenation. 

Here is an example of such use, taken directly from one of 
our projects that uses semantic properties [12].

/**
 * Returns a boolean indicating whether any debugging
 * facilities are turned off for a particular thread.
 * @concurrency GUARDED
 * @require (thread != null) Parameters must be valid.
 * @modify QUERY
 * @param thread we are checking the debugging condition
 * of this thread.
 * @return a boolean indicating whether any debugging
 * facilities are turned off for the specified thread.
 * @review kiniry Are the isOff() methods necessary at all?
 **/
public synchronized boolean isOff(Thread thread)
{
  return (!isOn(thread));
}



Existing tools already use these properties for translating
specifications, primarily in the form of contracts, into run-time
test code. Reto Kramer's iContract [19], the University
of Oldenburg's Semantic Group's Jass tool, Findler and
Felleisen's contract soundness checking tool [5], and Kiniry
and Cheong's JPP [16] are four such tools.
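
To give a feel for what translating a contract into run-time test code can mean, the following sketch hand-writes the check that a @require property implies. It is not the output of iContract, Jass, or JPP, whose generated code we do not reproduce here. The class DebugSwitch, the method turnOn, and the helper checkRequire are hypothetical names chosen for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, hand-instrumented class; an illustration only, not actual tool output. */
final class DebugSwitch {
    private final Set<Thread> enabled = new HashSet<>();

    /** @require (thread != null) Parameters must be valid. */
    public synchronized void turnOn(Thread thread) {
        checkRequire(thread != null, "Parameters must be valid.");
        enabled.add(thread);
    }

    /** @require (thread != null) Parameters must be valid. */
    public synchronized boolean isOn(Thread thread) {
        checkRequire(thread != null, "Parameters must be valid.");
        return enabled.contains(thread);
    }

    /** @require (thread != null) Parameters must be valid. */
    public synchronized boolean isOff(Thread thread) {
        checkRequire(thread != null, "Parameters must be valid.");
        return !isOn(thread);
    }

    /** Run-time realization of a precondition: fail fast when the caller breaks the contract. */
    private static void checkRequire(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException("Precondition violated: " + message);
        }
    }
}
```

The natural-language text that follows the predicate in the @require property is reused as the failure message, so a single annotation serves both the generated documentation and the test code.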

3.1.2 Eiffel 

In Eiffel, as mentioned earlier, we use indexing clauses as 
well as regularly structured comments to denote semantic 
properties. Using comments as well as indexing clauses is 
necessary because the syntax of Eiffel dictates that indexing 
clauses only appear at the top of a source file. The syntax 
of comments that use semantic properties is identical to that 
of indexing clauses, thus the same parser code can be used 
in both instances. An example of such use is as follows, 
directly from one of our Eiffel-based projects that uses semantic
properties [13].

indexing
   description: "The Extended BON scanner."
   project: "The Extended BON Tool Suite"
   author: "Joseph R. Kiniry <kiniry@acm.org>"
   copyright: "Copyright (C) 2001 Joseph R. Kiniry"
   version: "$Revision: 1.10 $"
   license: "Eiffel Forum Freeware License v1"
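
Because indexing clauses and property-bearing comments share one syntax, a single parser can serve both. The sketch below is a minimal example under stated assumptions: the class IndexingParser and its parse method are hypothetical, every entry is assumed to sit on one line in the form name: "value", and a property inside a comment is assumed to be prefixed with the Eiffel comment marker "--".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for single-line property-value entries of the form name: "value". */
final class IndexingParser {
    // An optional leading "--" lets the same pattern accept entries written inside comments.
    private static final Pattern ENTRY =
            Pattern.compile("\\s*(?:--\\s*)?(\\w+)\\s*:\\s*\"([^\"]*)\"\\s*");

    /** Returns the property-value pairs found in the text, in order of appearance. */
    static Map<String, String> parse(String text) {
        Map<String, String> properties = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            Matcher matcher = ENTRY.matcher(line);
            if (matcher.matches()) {
                properties.put(matcher.group(1), matcher.group(2));
            }
        }
        return properties;
    }
}
```

Lines that do not match the entry pattern, such as the indexing keyword itself, are skipped, so the same method can be applied to a whole source file or to a single comment block.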

3.2 Specification Languages 

We have also used semantic properties to extend the BON 
specification language. 



3.2.1 BON



BON stands for the Business Object Notation. BON is described
in whole in Walden and Nerson's Seamless Object-Oriented
Software Architecture [31], extended from an earlier
paper by Nerson [25].

3.2.1.1 Primary Aspects 

BON is an unusual specification language in that it is seam- 
less, reversible, and focuses on contracting. BON also has 
both a textual and a graphical form. 

BON is seamless because it is designed to be used during all 
phases of program development. Multiple refinement levels 
(high-level design with charts, detailed design with types, 
and dynamism with scenarios and events), coupled with ex- 
plicit refinement relationships between those levels, means 
that BON can be used all the way from domain analysis to 
unit and system testing and code maintenance. 

Reversibility summarizes the weak but invertible nature of 
BON's semantics. By virtue of its design, every construct 
described in BON is fully realizable in program code. One 
can specify system structure, types, contracts, events, sce- 
narios, and more. Each of these constructs can not only be 
interpreted into program code, but program code can be in- 
terpreted into BON. As far as we are aware, this makes BON 
unique insofar as, with proper tool support, a system speci- 
fication need not become out-of-date if it is written in BON. 

Finally, BON focuses on software contracts as a primary
means of expressing system semantics. These contracts have
exactly the same semantics as discussed earlier with regard
to object-oriented models, because BON is an object-oriented
specification language.

BON's semantics were originally specified informally using
Eiffel [22, 31]. Paige and Ostroff recently provided an
analysis of BON with an eye toward a refinement-centric formal
semantics [26, 27].

3.2.1.2 BON Technologies 

BON has been, and is being, used within several commercial
and Open Source tools: Interactive Software Engineering's
EiffelCase and EiffelStudio (see footnote 9) tools; Ehrke's
BonBon CASE tool; Steve Thompson and Roy Phillips's
BONBAZ/Envision project; Kaminskaya's BON static diagram
tool [11]; Paige, Lancaric, and Ostroff's BON CASE tool; and
Kiniry's EBON tool suite [14].

The last three are particularly exciting projects because they 
are currently active and have wide applicability. Kamin- 
skaya's and Lancaric's tools generate textual BON, JML, 
Eiffel, and Java source code from a graphical BON speci- 
fication. 

3.2.2 Extended BON 

Kiniry's EBON tool suite has a different aim. Its secondary
purpose is similar to that of the previously mentioned tools,
namely the generation of documentation from BON
specifications. But its primary use is design model checking
for Eiffel and Java code.

9 http://www.eiffel.com/products/studio51/

BON is extended with our set of semantic properties by (a)
extending the BON language (adding new keywords and
expressions), (b) using structured comments, and (c) using
indexing clauses like those in Eiffel. More information on
these specific extensions is available at the EBON web site [14].

3.2.2.1 Domain-Specific Semantics 

Translations from BON to a source language and vice-versa 
are to be represented by kind theory interpretations. This 
means that changes to either side of the translation can not 
only be translated, but can be checked for validity accord- 
ing to its (dual) model. This specification-code conformance 
(validity) checking is what we call design model checking. 
We use this terminology because the specification is the the- 
ory and the program code is the model, when viewed from 
the logical perspective. 

A change in the source code that is part of an EBON
interpretation image will automatically trigger a corresponding
change in the EBON specification. Likewise, any change in
the EBON specification will automatically trigger a
corresponding change in the source code.
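The conformance half of this idea can be illustrated with a small Java sketch. It is not how EBON is implemented: kind theory interpretations carry far more structure than feature names. Here we assume, purely for illustration, that a specification is reduced to a set of feature names and that the code side is read through Java reflection. The classes ConformanceCheck and Switch are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical conformance check between a specification and a class. */
final class ConformanceCheck {
    /** Names of the public methods declared directly by the class. */
    static Set<String> codeFeatures(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    /** Features the specification names but the code lacks. */
    static Set<String> missingInCode(Set<String> specFeatures, Class<?> type) {
        Set<String> missing = new TreeSet<>(specFeatures);
        missing.removeAll(codeFeatures(type));
        return missing;
    }

    /** Features the code declares but the specification does not mention. */
    static Set<String> missingInSpec(Set<String> specFeatures, Class<?> type) {
        Set<String> extra = codeFeatures(type);
        extra.removeAll(specFeatures);
        return extra;
    }
}

/** Toy class standing in for the code under check. */
final class Switch {
    private boolean on;
    public boolean isOn() { return on; }
    public boolean isOff() { return !on; }
    public void toggle() { on = !on; }
}
```

A non-empty result from either method marks the point at which a tool would either propagate the change to the other side or report the specification and code as out of step.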

Some of these translations entail more than just a transfer of 
information from a specification to a comment in the source 
code. For example, an invariant semantic tag is inter- 
preted not only as documentation, but also as run-time test 
code. We do not have the space in this paper to detail this in- 
terpretation. It follows the same lines as related tools that 
support contract-based assertions in Java and Eiffel men- 
tioned elsewhere in this paper. 
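As a rough illustration of what such an interpretation produces, consider the following Java sketch. It assumes the invariant is written as a tag in a Javadoc-style comment, and the checking method is hand-written here to stand in for generated code. It does not reproduce the output of any particular tool, and the class Counter is hypothetical.

```java
/**
 * Hypothetical counter used only for illustration.
 *
 * @invariant count >= 0
 */
final class Counter {
    private int count;

    /** Hand-written stand-in for the check a tool might generate from the tag. */
    private void checkInvariant() {
        if (!(count >= 0)) {
            throw new IllegalStateException("invariant violated: count >= 0");
        }
    }

    void increment() {
        checkInvariant();  // the invariant holds on entry
        count++;
        checkInvariant();  // and must hold again on exit
    }

    void decrement() {
        checkInvariant();
        count--;
        checkInvariant();  // trips if the counter drops below zero
    }

    int count() {
        return count;
    }
}
```

The same tag thus serves two readers: the person reading the comment and the test harness exercising the class.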

3.2.2.2 Belief-Driven Development

Some BON extensions are non-reversible because they represent
system aspects that are either very difficult or impossible
to derive. For example, the time-complexity semantic
property specifies the computational complexity of
a feature. It is rarely possible to extract such information
from an algorithm with automated tools. But the
algorithm author often knows her algorithm's complexity.
Thus, stating the complexity as part of the algorithm
specification with a semantic property is an easy and
straightforward task.

Now the question arises: How do we know such specifica- 
tions that are not automatically checkable remain valid? This 
is where the earlier-mentioned belief truth structures of kind 
theory come into play. 

When the programmer writes the original time-complexity
semantic property for a feature, she is stating a belief about
that feature. Beliefs in kind theory are autoepistemic (the
representation of the programmer is part of the logical
sentence encoding the belief), have an associated "strength" or
"surety" metric (recall Section 2.2.1), and include a set of
evidence supporting the belief.



We use a number of techniques to ensure that old or
out-of-date beliefs are rechecked. With regard to this example, we
define a continued validity condition as part of the evidence,
which is machine checkable. Currently, if the program code 
or documentation to which the complexity metric belief is 
bound radically changes in size, or if the feature has a change 
in type, author, or other potentially complexity-impacting 
specification (e.g., concurrency, space-complexity, etc.), then 
the validity condition is tripped and the developer is chal- 
lenged to re-check and validate the belief, restarting the pro- 
cess. 
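A continued validity condition of this kind might be sketched in Java as follows. This is an illustration under stated assumptions rather than our implementation: the class ComplexityBelief, the choice of line count and signature as recorded evidence, the [0, 1] range for surety, and the relative-change threshold that stands in for "radically changes in size" are all hypothetical.

```java
/** Hypothetical record of a belief about a feature's time complexity. */
final class ComplexityBelief {
    final String author;              // who holds the belief
    final String claim;               // for example "O(n log n)"
    final double surety;              // strength; the [0, 1] range is an assumption
    final int linesWhenStated;        // evidence: feature size when recorded
    final String signatureWhenStated; // evidence: feature type when recorded

    ComplexityBelief(String author, String claim, double surety,
                     int linesWhenStated, String signatureWhenStated) {
        if (linesWhenStated <= 0) {
            throw new IllegalArgumentException("feature must have at least one line");
        }
        this.author = author;
        this.claim = claim;
        this.surety = surety;
        this.linesWhenStated = linesWhenStated;
        this.signatureWhenStated = signatureWhenStated;
    }

    /**
     * True while the feature still resembles what the author reasoned about:
     * same signature, and size within the given relative change.
     */
    boolean stillValid(int currentLines, String currentSignature,
                       double maxRelativeChange) {
        double change =
            Math.abs(currentLines - linesWhenStated) / (double) linesWhenStated;
        return signatureWhenStated.equals(currentSignature)
            && change <= maxRelativeChange;
    }
}
```

When stillValid returns false, an environment would challenge the developer to re-examine the claim and record a fresh belief with new evidence.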

4. EXPERIENCES 

We have used semantic properties within our research group, 
in the classroom, and in two corporate settings. 

The Compositional Computing Group at Caltech has used
semantic properties in our complex, distributed, and
concurrent architectures, written in Java and Eiffel, over the past
five years. We have witnessed their utility first-hand. In
particular, introducing our complex technologies to new
students and collaborators is greatly facilitated by
semantic properties.

Students grumble at first when they are told that their com- 
ments now have a precise syntax and a semantics. The stu- 
dents initially think of this as being "just more work" on their 
part — yet another reason to hand in a late assignment. But, 
as the term goes by, the students incorporate the precise doc- 
umentation with semantic properties and related tools into 
their development process. After a few weeks of indoctrina- 
tion, they not only stop complaining, but start praising the 
process and tools. We generally see higher quality systems 
and the students report spending less time on their home- 
work than when they started the course. They have learned 
how to work smarter, not harder. 

These languages, process, and tools were also used in a
corporate setting to develop an enormously complex, distributed,
concurrent architecture. When showing the system to
potential funders and collaborators, being able to present the
system architecture and code with this level of specification and
documentation invariably increased our value proposition.
Investors were uniformly shocked that a startup would
actually design its system, and that we used lightweight
formal methods to design, build, test, and document the
system was, to them, unheard of.

We have incorporated feedback from these three domains 
into our work. Our set of semantic properties is still evolv- 
ing, albeit at a rapidly decreasing rate. Our tools see re- 
finement for incorporation into new development processes, 
better error handling, and more complete and correct func- 
tionality. This user feedback is essential to understanding 
how these technologies and theory can be exposed in aca- 
demic and industrial settings. 

5. CONCLUSION 

Documentation reuse is most often discussed in the literate
programming [3] and hypertext domains [6]. Little research
exists for formalizing the semantics of semi-structured
documentation. Some work in formal concept analysis and
related formalisms [29, 34] has started down this path, but with
extremely loose semantics and little to no tool support.

Recent work by Wendorff [32, 33] bears resemblance to this
work both in its nature (that of concept formation and
resolution) and theoretic infrastructure (that of category theory).
Our work is differentiated by its broader scope, its more
expressive formalism, and its realization in tools. Additionally,
the user-centric nature of kind theory (not discussed in this
article) makes exposing the formalism to the typical
software engineer straightforward.

Our next steps are on two fronts. First, we are interested in 
embedding our semantic properties in the Java-centric spec- 
ification language JML. Second, we are continuing to de- 
velop new tools and technologies to realize knowledgeable 
development environments that use kind theory as a formal 
foundation. 

5.1 JML 

JML is the Java Modeling Language [20]. JML is a Java-
and model-centric language in the same tradition as Larch
and VDM. JML is used to specify detailed semantic aspects
of Java code, and some tool support exists for type-checking
and translating these specifications into documentation and
run-time test code [2, 7, 28]. The formal semantics of JML
have been partially specified via a logic as part of the LOOP
project [10].

Extending JML with semantic properties would follow the 
same course that we have used for BON. Because we already 
have integrated semantic properties with Java, and given the 
existing tool support for JML, we should be able to real- 
ize inter-domain interpretations that preserve a vast amount 
of information about JML-specified Java systems using kind 
theory. 

5.2 Social Implications 

We expect that knowledgeable development environments 
will have social implications for software development. 

First, this challenging, interactive style imposed by knowl- 
edgeable development environments is not typical — we have 
to make sure that we are not introducing some kind of for- 
mal methods "paper clip". Thus, the environment needs to 
"tune" itself to the interactive style and development process 
of the user. We look forward to theoretically representing 
such styles in kind theory so that tuning is simply part of the 
logical context. 

Second, in our extensive experience in the research lab,
classroom, and corporate office, we have witnessed the fact that
most developers are very uncomfortable starting from scratch,
especially with regard to system documentation and
informal and formal specifications. If some documentation
or specification already exists, developers are much more likely
to continue in that trend because they feel that they are
contributing rather than creating.



Because the EBON tool suite will automatically generate 
a base specification from program code, and because the 
specification-code validity conformance is automatically main- 
tained, we have a primer as well as a positive feedback cycle 
for lightweight specification with semantic properties. Only 
time and experience will tell whether this is a sufficient fire 
to light the correct software fuse. 

5.3 Knowledgeable Environments 

As mentioned previously, our work on KDEs continues. 

We wish to augment our already powerful development
environment in two ways. Our first step entails integrating
an interactive front-end like XEmacs with our kind system
realized in Maude. The availability of Emacs-centric tools
for proof systems, like the excellent Proof General, makes
this a relatively straightforward exercise. The most
time-consuming aspect is writing interpretation engines that
translate annotated source code to and from a kind representation
format. Several such tools are being prototyped now [14].

We also plan on integrating these environments with our
reusable knowledge repository known as the Jiki [13]. The
Jiki is a read/write web architecture realized as a distributed 
component-based Java Wiki. All documents stored in the 
Jiki are represented as instances of kind. Manipulating Jiki 
assets, including adding or deleting information or searching 
for reusable assets, is realized through a forms-based web in- 
terface as well as through a Java component-based API. 

APPENDIX

A. SEMANTIC PROPERTIES SUMMARY

Table 1: The Full Set of Semantic Properties

Category          Properties
Meta-Information  author, bon, bug, copyright, description, history,
                  license, title
Contracts         ensure, generate, invariant, modifies, require
Versioning        version, deprecated, since
Documentation     design, equivalent, example, see
Concurrency       concurrency
Usage             param, return, exception
Dependencies      references, use
Miscellaneous     guard, values, time-complexity, space-complexity
Inheritance       hides, overrides
Pending Work      idea, review, todo